|
In statistics, a probit model is a type of regression where the dependent variable can only take two values, for example married or not married. The name is from ''probability'' + ''unit''.〔''Oxford English Dictionary'', 3rd ed. s.v. ''probit'' (article dated June 2007): 〕 The purpose of the model is to estimate the probability that an observation with particular characteristics will fall into a specific one of the categories; moreover, if estimated probabilities greater than 1/2 are treated as classifying an observation into a predicted category, the probit model is a type of binary classification model. A probit model is a popular specification for an ordinal〔Ordinal probit regression model UCLA Academic Technology Services http://www.ats.ucla.edu/stat/stata/dae/ologit.htm〕 or a binary response model. As such it treats the same set of problems as does logistic regression using similar techniques. The probit model, which employs a probit link function, is most often estimated using the standard maximum likelihood procedure, such an estimation being called a probit regression. Probit models were introduced by Chester Bliss in 1934; a fast method for computing maximum likelihood estimates for them was proposed by Ronald Fisher as an appendix to Bliss' work in 1935. ==Conceptual framework== Suppose response variable ''Y'' is ''binary'', that is it can have only two possible outcomes which we will denote as 1 and 0. For example ''Y'' may represent presence/absence of a certain condition, success/failure of some device, answer yes/no on a survey, etc. We also have a vector of regressors ''X'', which are assumed to influence the outcome ''Y''. Specifically, we assume that the model takes the form : where Pr denotes probability, and Φ is the Cumulative Distribution Function (CDF) of the standard normal distribution. The parameters ''β'' are typically estimated by maximum likelihood. It is possible to motivate the probit model as a latent variable model. Suppose there exists an auxiliary random variable : where ''ε'' ~ ''N''(0, 1). Then ''Y'' can be viewed as an indicator for whether this latent variable is positive: : The use of the standard normal distribution causes no loss of generality compared with using an arbitrary mean and standard deviation because adding a fixed amount to the mean can be compensated by subtracting the same amount from the intercept, and multiplying the standard deviation by a fixed amount can be compensated by multiplying the weights by the same amount. To see that the two models are equivalent, note that : 抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「probit model」の詳細全文を読む スポンサード リンク
|